Internet Info 1997 December

Internet Message Format | 1997-02-19 | 2KB

Received: (from daemon@localhost) by services.bunyip.com (8.6.10/8.6.9)
	id PAA19876 for urn-ietf-out; Thu, 31 Oct 1996 15:56:36 -0500
Received: from mocha.bunyip.com (mocha.Bunyip.Com [192.197.208.1])
	by services.bunyip.com (8.6.10/8.6.9) with SMTP id PAA19871
	for <urn-ietf@services.bunyip.com>; Thu, 31 Oct 1996 15:56:34 -0500
Received: from windrose.omaha.ne.us by mocha.bunyip.com with SMTP
	(5.65a/IDA-1.4.2b/CC-Guru-2b) id AA28840
	(mail destined for urn-ietf@services.bunyip.com); Thu, 31 Oct 96 15:56:31 -0500
Message-Id: <9610312056.AA28840@mocha.bunyip.com>
Received: by privateer.windrose.omaha.ne.us; Thu Oct 31 14:55 CST 1996
From: "Ryan Moats" <jayhawk@ds.internic.net>
To: "Gregory J. Woodhouse" <gjw@wnetc.com>,
	"Martin J Duerst" <mduerst@ifi.unizh.ch>
Cc: "urn-ietf@bunyip.com" <urn-ietf@bunyip.com>
Date: Thu, 31 Oct 96 14:56:43
Priority: Normal
X-Mailer: PMMail 1.52 For OS/2 UNREGISTERED SHAREWARE
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Subject: [URN] UTF-8/ASCII (was: New Syntax Draft)
Sender: owner-urn-ietf@services.bunyip.com
Precedence: bulk
Reply-To: "Ryan Moats" <jayhawk@ds.internic.net>
Errors-To: owner-urn-ietf@bunyip.com

On Thu, 31 Oct 1996 05:54:38 -0800 (PST), Gregory J. Woodhouse wrote:

>On Thu, 31 Oct 1996, Martin J Duerst wrote:
>
>> Gregory Woodhouse wrote:
>>
>> This difference is always absolutely clear! A string that only contains
>> bytes with the high bit '0' (7-bit bytes) is ASCII. As soon as a high
>> bit is set to '1', it's not ASCII anymore. It may be UTF-8, or may not.
>>
>I understand this point, but I'm not talking about UTF-8 itself, but UTF-8
>encoded into ASCII.

At the risk of being obvious, any UTF-8 encoded to ASCII (according to the
syntax doc) is going to have a leading octet encoded as something in the
%80-%FF range. Anything in the %00-%7F range is an unsafe (based on one
definition or another) ASCII character.
However, I think this is somewhat moot (I think this argues for removing the
UTF-8 decoding requirement.)

Ryan
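
The distinction drawn in the message above (escapes in the %80-%FF range
come from UTF-8 octets with the high bit set, escapes in the %00-%7F range
come from ASCII characters that had to be escaped) can be illustrated with a
minimal sketch. It assumes Python's urllib.parse.quote as a stand-in for the
%-encoding in the syntax draft; the draft's exact set of unsafe characters is
not given here, so the helper names and the choice of safe="" are
illustrative assumptions only.

```python
from urllib.parse import quote


def is_ascii(data):
    """True when no octet has the high bit set (the 7-bit test from the thread)."""
    return all(octet < 0x80 for octet in data)


def percent_encode_utf8(text):
    """Encode text as UTF-8, then %-escape every octet quote() does not leave bare.

    Assumption: safe="" escapes everything except letters, digits and "_.-~",
    which is not necessarily the rule set of the URN syntax draft.
    """
    return quote(text.encode("utf-8"), safe="")


def escaped_octets(encoded):
    """Return the numeric value of each %XX escape found in an encoded string."""
    values = []
    i = 0
    while i < len(encoded):
        if encoded[i] == "%":
            values.append(int(encoded[i + 1:i + 3], 16))
            i += 3
        else:
            i += 1
    return values


def classify(encoded):
    """Split escapes into high-bit octets (from UTF-8) and escaped ASCII octets."""
    octets = escaped_octets(encoded)
    high = [o for o in octets if o >= 0x80]
    low = [o for o in octets if o < 0x80]
    return high, low


if __name__ == "__main__":
    for sample in ["caf\u00e9", "a b", "\u65e5"]:
        encoded = percent_encode_utf8(sample)
        print(encoded, classify(encoded), is_ascii(encoded.encode("ascii")))
```

In this sketch the encoded form is always plain ASCII, while the escapes
themselves still show which octets came from multi-octet UTF-8 sequences and
which came from escaped ASCII characters.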